A Performance-Correctness Explicitly-Decoupled Architecture: Technical Report

Authors

  • Alok Garg
  • Michael C. Huang
Abstract

Optimizing the common case has been an adage in decades of processor design practice. However, as system complexity and the sophistication of optimization techniques have increased substantially, maintaining correctness under all situations, however unlikely, is contributing to the necessity of extra conservatism in all layers of the system design. The mounting process, voltage, and temperature variation concerns further add to the conservatism in setting operating parameters. Excessive conservatism in turn hurts performance and efficiency in the common case. However, much of the system’s complexity comes from advanced performance features and may not compromise the whole system’s functionality and correctness even if some components are imperfect and introduce occasional errors. We propose to separate performance goals from the correctness goal using an explicitly-decoupled architecture. In this paper, we discuss one such incarnation where an independent core serves as an optimistic performance enhancement engine that helps accelerate the correctness-guaranteeing core by passing high-quality predictions and performing accurate prefetching. The lack of concern for correctness in the optimistic core allows us to optimize its execution more effectively than is possible when optimizing a monolithic core with correctness requirements. We show that such a decoupled design allows significant optimization benefits and is much less sensitive to conservatism applied in the correctness domain.

Similar resources

Exploring Performance-Correctness Explicitly-Decoupled Architectures

Optimizing the common case has been an adage in decades of processor design practices. However, as the system complexity and optimization techniques’ sophistication have increased substantially, maintaining correctness under all situations, however unlikely, is contributing to the necessity of extra conservatism in all layers of the system design. The mounting process, voltage, and temperature ...

A Decoupled Federate Architecture for Distributed Simulation Cloning

Distributed simulation cloning technology is designed to perform “what-if” analysis of existing High Level Architecture (HLA) based distributed simulations. The technology aims to enable the examination of alternative scenarios concurrently within the same simulation execution session. State saving and recovery are necessary for cloning a federate at runtime. However, it is very difficult to hav...

Accelerating Decoupled Look-ahead via Weak Dependence Removal: A Metaheuristic Approach – Technical Report

Despite the proliferation of multi-core and multi-threaded architectures, exploiting implicit parallelism for a single semantic thread is still a crucial component in achieving high performance. Look-ahead is a tried-and-true strategy in uncovering implicit parallelism, but a conventional, monolithic out-of-order core quickly becomes resource-inefficient when looking beyond a small distance. A ...

Performance of the decoupled ACRI-1 architecture: the perfect club

This paper examines the performance potential of decoupled computer architectures on real-world codes, and includes the first performance bounds calculations to be published for the highly-decoupled ACRI-1 computer architecture. It also constitutes the first published work to report on the effectiveness of a decoupling Fortran90 compiler. Decoupling is an architectural optimisation which offers very ...

HiDISC: A Decoupled Architecture for Applications in Data Intensive Computing

The ever-growing speed gap between processor and main memory has been a major performance bottleneck of modern computer systems. As a result, today’s data intensive applications suffer from frequent cache misses and lose many CPU cycles due to pipeline stalling. Although traditional prefetching methods reduce cache misses considerably, most of them strongly depend on the access pattern being pr...

Publication date: 2008